Abstract

This paper explores the fundamental properties of distributed minimization of a sum 
of functions with each function only known to one node, and a pre-specified level of node 
knowledge and computational capacity. We define the optimization information each node 
receives from its objective function, the neighboring information each node receives from its 
neighbors, and the computational capacity each node can take advantage of in controlling 
its state. It is proven that there exist a neighboring information rule and a control law
that guarantee global optimal consensus if and only if the solution sets of the local objective
functions have a nonempty intersection, for fixed strongly connected graphs. Then
we show that for any tolerated error, we can find a control law that guarantees global 
optimal consensus within this error for fixed, bidirectional, and connected graphs under 
mild conditions. For time-varying graphs, we show that optimal consensus can always be 
achieved as long as the graph is uniformly jointly strongly connected and the nonempty 
intersection condition holds. The results illustrate that nonempty intersection for the local 
optimal solution sets is a critical condition for successful distributed optimization for a large 
class of algorithms. 

1 Introduction



1.1 Motivation 

Distributed optimization concerns finding a global optimum using local information exchange and
cooperative computation over a network. In such problems, there is a global objective function
to be minimized, and each node in the network can only observe part of the objective.
The update dynamics are executed through an update equation implemented in each node of the
network, based on the information received from the local objective and the neighbors.

The literature has not sufficiently studied the real meaning of "distributed" optimization,
or the level of distribution possible for convergence. Some algorithms converge faster
than others, but they depend on more information exchange and a more complex iteration
rule. For a precise study of the level of distribution for optimization methods, the way nodes
share information and the computational capacity of each node should be specified. Thus, an
interesting question arises: fixing the knowledge set and the computational capacity, what is 
the best performance of any distributed algorithm? In this paper, we investigate the fundamen- 
tal performance limits of distributed algorithms when the constraints on how nodes exchange 
information and on their computational capacity are fixed. We address these limits from a 
dynamical system point of view and characterize some fundamental conditions on the global 
objective function for a distributed solution to exist. 

1.2 Related Works 

Distributed optimization is a classical topic in applied mathematics with several excellent text- 
books, e.g., ISlEllS]. 

Assuming that some estimate of the subgradient for each component of the overall objec- 
tive function can be passed over the network from one node to another via deterministic or 
randomized iteration, a class of subgradient-based incremental algorithms was investigated in 
[40], [35], [41], [43], [32]. A series of results were established combining consensus and subgradient
computation. This idea can be traced back to the 1980s and the pioneering work [21]. A subgradient
method for fixed undirected topology was given in [36]. Then in [32], convergence bounds for
time-varying graphs with various connectivity assumptions were shown. This work was then 
extended to a constrained optimization case in [33], where each agent is assumed to always lie 
in a particular convex set. Consensus and optimization were shown to be guaranteed when 
each node makes a projection onto its own set at each step. Following the ideas of [33], a
randomized discrete-time algorithm and a deterministic continuous-time algorithm were presented
for optimal consensus in [33] and [50], respectively, where in both cases the goal is to form a 
consensus within the intersection of the optimal solution sets of the local objective functions. 
An augmented Lagrangian algorithm was presented for constrained optimization with directed 
gossip communication in [34]. An alternative approach was presented in [38], where the nodes 
keep their gradient sum equal to zero during the iteration by utilizing gossiping. 

Dynamical system solutions to distributed optimization problems have been considered for
more than fifty years. The Arrow-Hurwicz-Uzawa flow was shown to converge to the set of
saddle points for a constrained convex optimization problem [15]. In [36], a simple and elegant 
continuous-time protocol was presented to solve linear programming problems. More recently, 
in [38], a continuous-time solution having second-order node dynamics was proposed for solving 
distributed optimization problems for fixed bidirectional graphs. In [3S], a smooth vector field 
was shown to be able to drive the system trajectory to converge to the saddle point of the 
Lagrangian of a convex and constrained optimization problem. In [50], a network of first-order
dynamical systems was proposed to solve convex intersection computation problems with directed
time-varying communication graphs. Besides optimization, a continuous-time interpretation of
discrete-time algorithms was discussed for recursive stochastic algorithms in [47].

Consensus algorithms have been proven to be useful in the design of distributed optimization 
methods [32l [331 [331 [501 [38l [38] . Consensus methods have also been extensively studied for both 
discrete-time and continuous-time models in the past decade, some references related to the 
current paper include [2Il[2Dl[ISl[2a[Ml[231[I3[I51[3ni[3Il[2Sl[2B]. 

1.3 Main Contribution 

This paper considers the following distributed optimization model. The network consists of $N$
nodes with directed communication. Each node $i$ has a convex objective function $f_i : \mathbb{R}^m \to \mathbb{R}$.
The goal of the network is to reach consensus while minimizing the function $\sum_{i=1}^N f_i$. At
any time $t$, each node $i$ observes the gradient $g_i(t)$ of $f_i$ at its current state and the neighboring
information $n_i(t)$ from its neighbors. The neighboring information is zero when the node's state
is equal to all its neighbors' states. The evolution of the nodes' states is given by a first-order
integrator whose right-hand side is a control law $\mathcal{J}(n_i, g_i)$ taking feedback from $g_i(t)$ and $n_i(t)$.
We assume $\mathcal{J}(n_i, g_i)$ to be injective in $g_i$ when $n_i$ takes value zero.

The main results we obtain are stated as follows:

• We prove that there exist a neighboring information rule $n_i$ and a control law $\mathcal{J}$
guaranteeing global optimal consensus if and only if the intersection of the solution sets of
$f_i$, $i = 1, \dots, N$, is nonempty, for fixed strongly connected graphs.

• We show that given any $\epsilon > 0$, there exists a control law $\mathcal{J}$ that guarantees global optimal
consensus with error no larger than $\epsilon$ for fixed, bidirectional, and connected graphs under
mild conditions.

• We show that optimal consensus can always be achieved for time-varying graphs as long as
the graph is uniformly jointly strongly connected and the nonempty intersection condition
above holds.

We conclude that the nonempty intersection of the solution sets of the local objectives seems 
to be a fundamental condition for distributed optimization. 

1.4 Paper Organization 

In Section 2, some preliminary mathematical concepts and lemmas are introduced. In Section 3, 
we formulate the considered optimization model, node dynamics, and define the problem of 
interest. Section 4 focuses on fixed graphs. A necessary and sufficient condition is presented 
for the exact solution of optimal consensus, and then approximate solutions are investigated as
$\epsilon$-optimal consensus. Section 5 is on time-varying graphs, and we show optimal consensus under
uniformly jointly strongly connected graphs. Finally, in Section 6 some concluding remarks are 
given. 

2 Preliminaries 

In this section, we introduce some notations and provide preliminary results that will be used 
in the rest of the paper. 

2.1 Directed Graphs 

A directed graph (digraph) $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ consists of a finite set $\mathcal{V}$ of nodes and an arc set $\mathcal{E}$, where
an arc is an ordered pair of distinct nodes of $\mathcal{V}$ [S]. An element $(i, j) \in \mathcal{E}$ describes an arc which
leaves $i$ and enters $j$. A walk in $\mathcal{G}$ is an alternating sequence $W : i_1 e_1 i_2 e_2 \dots e_{m-1} i_m$ of nodes
$i_k$ and arcs $e_k = (i_k, i_{k+1}) \in \mathcal{E}$ for $k = 1, 2, \dots, m-1$. A walk is called a path if the nodes
of the walk are distinct, and a path from $i$ to $j$ is denoted as $i \to j$. $\mathcal{G}$ is said to be strongly
connected if it contains paths $i \to j$ and $j \to i$ for every pair of nodes $i$ and $j$. A digraph $\mathcal{G}$ is
called bidirectional when for any two nodes $i$ and $j$, $(i, j) \in \mathcal{E}$ if and only if $(j, i) \in \mathcal{E}$. Ignoring
the direction of the arcs, the connectivity of a bidirectional digraph is transformed to that of
the corresponding undirected graph. A time-varying graph is defined as $\mathcal{G}_{\sigma(t)} = (\mathcal{V}, \mathcal{E}_{\sigma(t)})$, where
$\sigma : [0, +\infty) \to \mathcal{Q}$ denotes a piecewise constant function and $\mathcal{Q}$ is a finite set containing all
possible graphs with node set $\mathcal{V}$. Moreover, the joint graph of $\mathcal{G}_{\sigma(t)}$ in time interval $[t_1, t_2)$ with
$t_1 < t_2 \le +\infty$ is denoted as $\mathcal{G}([t_1, t_2)) = \cup_{t \in [t_1, t_2)} \mathcal{G}(t) = (\mathcal{V}, \cup_{t \in [t_1, t_2)} \mathcal{E}_{\sigma(t)})$.
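
The graph notions above can be checked mechanically on small examples. The following is a minimal sketch, not part of the original development: the helper names (`reachable`, `is_strongly_connected`, `is_bidirectional`, `joint_graph`) and the example graphs are illustrative assumptions. It tests strong connectivity by reachability, tests bidirectionality arc by arc, and forms a joint graph as the union of arc sets.

```python
from collections import deque


def reachable(nodes, arcs, start):
    """Return the set of nodes reachable from `start`, where arc (i, j) leaves i and enters j."""
    succ = {v: [] for v in nodes}
    for i, j in arcs:
        succ[i].append(j)
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for w in succ[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def is_strongly_connected(nodes, arcs):
    # Strongly connected: every node reaches every other node along a path.
    return all(reachable(nodes, arcs, v) == set(nodes) for v in nodes)


def is_bidirectional(arcs):
    # (i, j) is an arc if and only if (j, i) is an arc.
    arc_set = set(arcs)
    return all((j, i) in arc_set for (i, j) in arc_set)


def joint_graph(arc_sets):
    # Arc set of the joint graph: union of the arc sets active over a time interval.
    return set().union(*arc_sets)


nodes = [1, 2, 3]
cycle = [(1, 2), (2, 3), (3, 1)]                    # directed 3-cycle (hypothetical example)
g_a, g_b = [(1, 2), (2, 1)], [(2, 3), (3, 2)]       # two graphs that are disconnected on their own
print(is_strongly_connected(nodes, cycle), is_bidirectional(cycle))
print(is_strongly_connected(nodes, g_a),
      is_strongly_connected(nodes, joint_graph([g_a, g_b])))
```

The last line illustrates why joint connectivity is weaker than connectivity at each time instant: neither `g_a` nor `g_b` connects all three nodes, but their union does.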

2.2 Dini Derivatives 

The upper Dini derivative of a continuous function $h : (a, b) \to \mathbb{R}$ ($-\infty \le a < b \le \infty$) at $t$ is
defined as

$$D^+ h(t) = \limsup_{s \to 0^+} \frac{h(t + s) - h(t)}{s}.$$

When $h$ is continuous on $(a, b)$, $h$ is non-increasing on $(a, b)$ if and only if $D^+ h(t) \le 0$ for any
$t \in (a, b)$. The next result is convenient for the calculation of the Dini derivative [10], [28].

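Lemma 1 Let $V_i(t, x) : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$ ($i = 1, \dots, n$) be $C^1$ and $V(t, x) = \max_{i=1,\dots,n} V_i(t, x)$. If
$\mathcal{I}(t) = \{i \in \{1, 2, \dots, n\} : V(t, x(t)) = V_i(t, x(t))\}$ is the set of indices where the maximum is
reached at $t$, then $D^+ V(t, x(t)) = \max_{i \in \mathcal{I}(t)} \dot{V}_i(t, x(t))$.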

2.3 Limit Sets 

Consider the following autonomous system

$$\dot{x} = f(x), \qquad (1)$$

where $f : \mathbb{R}^d \to \mathbb{R}^d$ is a continuous function. Let $x(t)$ be a solution of (1) with initial condition
$x(t_0) = x^0$. Then $\Omega_0 \subset \mathbb{R}^d$ is called a positively invariant set of (1) if, for any $t_0 \in \mathbb{R}$ and any
$x^0 \in \Omega_0$, we have $x(t) \in \Omega_0$, $t \ge t_0$, along every solution $x(t)$ of (1).

We call $y$ an $\omega$-limit point of $x(t)$ if there exists a sequence $\{t_k\}$ with $\lim_{k \to \infty} t_k = \infty$ such
that

$$\lim_{k \to \infty} x(t_k) = y.$$

The set of all $\omega$-limit points of $x(t)$ is called the $\omega$-limit set of $x(t)$, and is denoted as $\Lambda^+(x(t))$.
The following lemma is well-known [9].

Lemma 2 Let $x(t)$ be a solution of (1). Then $\Lambda^+(x(t))$ is positively invariant. Moreover, if
$x(t)$ is contained in a compact set, then $\Lambda^+(x(t)) \neq \emptyset$.

2.4 Convex Analysis 

A set $K \subset \mathbb{R}^d$ is said to be convex if $(1 - \lambda)x + \lambda y \in K$ whenever $x \in K$, $y \in K$ and $0 \le \lambda \le 1$.
For any set $S \subset \mathbb{R}^d$, the intersection of all convex sets containing $S$ is called the convex hull of
$S$, denoted by $\mathrm{co}(S)$.

Let $K$ be a closed convex subset in $\mathbb{R}^d$ and denote $|x|_K = \inf_{y \in K} |x - y|$ as the distance
between $x \in \mathbb{R}^d$ and $K$, where $|\cdot|$ is the Euclidean norm. There is a unique element $P_K(x) \in K$
satisfying $|x - P_K(x)| = |x|_K$ associated to any $x \in \mathbb{R}^d$ [5]. The map $P_K$ is called the projector
onto $K$. The following lemma holds [5].

Lemma 3 (i). $\langle P_K(x) - x, P_K(x) - y \rangle \le 0$, $\forall y \in K$.
(ii). $|P_K(x) - P_K(y)| \le |x - y|$, $x, y \in \mathbb{R}^d$.
(iii). $|x|_K^2$ is continuously differentiable at $x$ with $\nabla |x|_K^2 = 2(x - P_K(x))$.
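
As a quick sanity check of Lemma 3 (i) and (ii), the sketch below samples random points and evaluates both inequalities for a projector that has a closed form. The choice of $K$ as an axis-aligned box, the helper names, and the sampling ranges are assumptions made only for this illustration.

```python
import math
import random


def project_box(x, lo, hi):
    """Euclidean projection of x onto the box K = [lo_1, hi_1] x ... x [lo_d, hi_d]."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def norm(u):
    return math.sqrt(dot(u, u))


random.seed(0)
lo, hi = [-1.0, 0.0], [1.0, 2.0]
worst_i, worst_ii = -math.inf, -math.inf
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(2)]
    z = [random.uniform(-5, 5) for _ in range(2)]
    y = [random.uniform(l, h) for l, h in zip(lo, hi)]       # an arbitrary point of K
    px, pz = project_box(x, lo, hi), project_box(z, lo, hi)
    # Lemma 3(i): <P_K(x) - x, P_K(x) - y> should be nonpositive.
    worst_i = max(worst_i, dot(sub(px, x), sub(px, y)))
    # Lemma 3(ii): |P_K(x) - P_K(z)| - |x - z| should be nonpositive.
    worst_ii = max(worst_ii, norm(sub(px, pz)) - norm(sub(x, z)))
print(worst_i <= 1e-12, worst_ii <= 1e-12)
```

A box is used because its projector is a coordinate-wise clamp; any other closed convex set with a computable projector would serve equally well.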

Let $f : \mathbb{R}^d \to \mathbb{R}$ be a real-valued function. We call $f$ a convex function if for any $x, y \in \mathbb{R}^d$
and $0 \le \lambda \le 1$, it holds that $f((1 - \lambda)x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y)$. The following lemma
states some well-known properties for convex functions.

Lemma 4 Let $f : \mathbb{R}^d \to \mathbb{R}$ be a $C^1$ convex function.

(i). $f(x) \ge f(y) + \langle x - y, \nabla f(y) \rangle$.

(ii). Any local minimum is a global minimum, i.e., $\arg\min f = \{z : \nabla f(z) = 0\}$.

3 Problem Definition 
3.1 Objective 

Consider a network with node set $\mathcal{V} = \{1, 2, \dots, N\}$ modeled in general as a directed graph
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$. A node $j$ is said to be a neighbor of $i$ at time $t$ when there is an arc $(j, i) \in \mathcal{E}$, and
we denote by $\mathcal{N}_i$ the set of neighbors of node $i$.

Node $i$ is associated with a cost function $f_i : \mathbb{R}^m \to \mathbb{R}$, $m > 0$, which is observed by node $i$
only. The objective for the network is to cooperatively solve the optimization problem

$$\begin{array}{ll} \text{minimize} & \sum_{i=1}^N f_i(z) \\ \text{subject to} & z \in \mathbb{R}^m. \end{array} \qquad (2)$$

We impose the following assumption on the functions $f_i$, $i = 1, \dots, N$.
A1. For all $i = 1, \dots, N$, we have (i) $f_i \in C^1$; (ii) $f_i$ is a convex function; (iii) $\arg\min f_i \neq \emptyset$.
Problem (2) is equivalent to the following problem:

$$\begin{array}{ll} \text{minimize} & \sum_{i=1}^N f_i(z_i) \\ \text{subject to} & z_i \in \mathbb{R}^m, \\ & z_1 = \dots = z_N. \end{array} \qquad (3)$$

From (3) we see that consensus algorithms are a natural means for solving the optimization
problem (2).

3.2 Information Flow 

The state of node $i$ at time $t$ is denoted as $x_i(t) \in \mathbb{R}^m$. We define the information flow for node
$i$ as follows.

• The local optimization information $g_i(t)$ node $i$ receives from its objective $f_i$ at time $t$ is
the gradient of $f_i$ at its current state, i.e.,

$$g_i(t) = \nabla f_i(x_i(t)). \qquad (4)$$

• The neighboring information $n_i(t)$ node $i$ receives from its neighbors at time $t$ is

$$n_i(t) = h_i(x_i(t), x_j(t) : j \in \mathcal{N}_i), \qquad (5)$$

where $h_i : \mathbb{R}^m \times \mathbb{R}^{m|\mathcal{N}_i|} \to \mathbb{R}^l$ is a continuous function, $|\mathcal{N}_i|$ denotes the number of elements
in $\mathcal{N}_i$, and $l$ is a given integer indicating the dimension of the neighboring information.

Let $h = h_1 \otimes \dots \otimes h_N : \mathbb{R}^{m(1+|\mathcal{N}_1|)} \times \dots \times \mathbb{R}^{m(1+|\mathcal{N}_N|)} \to \mathbb{R}^{Nl}$ denote the direct sum of
$h_i$, $i = 1, \dots, N$. Then $h$ represents the rule of all neighboring information flow over the whole
network. We impose the following assumption.

A2. $h \in \mathcal{H} = \{h_1 \otimes \dots \otimes h_N : h_i : \mathbb{R}^{m(1+|\mathcal{N}_i|)} \to \mathbb{R}^l$ and $h_i = 0$ within the local consensus
manifold $\{x_i = x_j : j \in \mathcal{N}_i\}$ for all $i \in \mathcal{V}\}$.
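
One concrete rule satisfying Assumption A2 is a weighted sum of state differences. The sketch below is only an illustration under assumed data: the function name `neighbor_info`, the scalar states ($m = l = 1$), and the example weights are not from the original text. It shows that this rule returns zero whenever a node agrees with all of its neighbors.

```python
def neighbor_info(i, x, neighbors, weights):
    """Example rule h_i: weighted sum of state differences between node i and its neighbors.

    x: dict node -> state (scalar here, so m = l = 1)
    neighbors: dict node -> list of neighbors N_i
    weights: dict (i, j) -> positive weight a_ij (hypothetical values)
    """
    return sum(weights[(i, j)] * (x[j] - x[i]) for j in neighbors[i])


neighbors = {1: [2], 2: [1, 3], 3: [2]}
weights = {(1, 2): 1.0, (2, 1): 1.0, (2, 3): 0.5, (3, 2): 0.5}

x_consensus = {1: 0.7, 2: 0.7, 3: 0.7}     # every node agrees with its neighbors
x_generic = {1: 0.0, 2: 1.0, 3: 4.0}

print([neighbor_info(i, x_consensus, neighbors, weights) for i in (1, 2, 3)])
print([neighbor_info(i, x_generic, neighbors, weights) for i in (1, 2, 3)])
```

Any continuous function of the differences $x_j - x_i$ that vanishes at zero would satisfy A2 in the same way; the linear form is simply the most common choice in consensus algorithms.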

Remark 1 Assumption A2 is to say that the neighboring information a node receives from its 
neighbors becomes trivial when the node is in the same state as all its neighbors. This is a 
quite natural assumption in the literature on distributed averaging and optimization algorithms 

3.3 Computational Capacity

We adopt a dynamical system model to define the way nodes update their respective states. 
The evolution of the nodes' states is restricted to be a first-order integrator: 

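$$\dot{x}_i = u_i, \quad i = 1, \dots, N, \qquad (6)$$

where the right-hand side $u_i$ is interpreted as a control input and the control law is characterized
as

$$u_i = \mathcal{J}(n_i, g_i), \quad i = 1, \dots, N \qquad (7)$$

with $\mathcal{J} : \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}^m$.

For the control law $\mathcal{J}$, we impose the following assumption.

A3. The control law $\mathcal{J}(\cdot, \cdot) : \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}^m$ is continuous ($C^0$), and $\mathcal{J}(0, \cdot)$ is injective.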

Remark 2 Assumption A3 indicates that the control law applied in each node should have
the same structure, irrespective of individual local optimization information or neighboring
information. Note that our network model is homogeneous because one cannot tell the difference
from one node to another. We assume that the control law $\mathcal{J}(0, \cdot)$ is injective, so each node
responds differently to different gradient information on the local consensus manifold. Again,
Assumption A3 is widely applied in the literature J2^ {3^ \T1\ EE IMi- 

3.4 Problem 

Let $x(t) = (x_1^T(t), \dots, x_N^T(t))^T \in \mathbb{R}^{mN}$ be the trajectory of system (6) with control law (7) for
initial condition $x^0 = x(t_0)$. Denote $F(z) = \sum_{i=1}^N f_i(z)$. We introduce the following definition.

Definition 1 Global optimal consensus of system (6) with control law (7) is achieved if for all $x^0 \in \mathbb{R}^{mN}$, we have

$$\limsup_{t \to +\infty} F(x_i(t)) = \min_{z} F(z) \qquad (8)$$

and

$$\lim_{t \to +\infty} |x_i(t) - x_j(t)| = 0, \quad i, j = 1, \dots, N. \qquad (9)$$



The problem considered in this paper is to characterize conditions on the control law $\mathcal{J}$
under which global optimal consensus is achieved. In Section 4 this is done for fixed graphs and
in Section 5 for time-varying graphs.



4 Fixed Graphs 

In this section, we consider the possibility of solving optimal consensus using control law (7)
under fixed communication graphs. We first discuss whether exact optimal consensus can be
reached for directed graphs. Then we show the existence of an approximate solution for optimal
consensus over bidirectional graphs.

4.1 Exact Solution 

We make an assumption on the solution set of $F = \sum_{i=1}^N f_i$.
A4. $\arg\min F(z) \neq \emptyset$ is a bounded set.

The main result on the existence of a control law solving optimal consensus is stated as 
follows. 

Theorem 1 Assume that A1 and A4 hold. Let the communication graph $\mathcal{G}$ be fixed and strongly
connected. There exist a neighboring information rule $h \in \mathcal{H}$ and a control law $\mathcal{J}$ satisfying A3 such that
global optimal consensus is achieved if and only if

$$\bigcap_{i=1}^N \arg\min f_i(z) \neq \emptyset. \qquad (10)$$

Remark 3 According to Theorem 1, the optimal solution sets of $f_i$, $i = 1, \dots, N$, having
nonempty intersection is a critical condition for the existence of a control law $\mathcal{J}$ that solves
the optimal consensus problem. Condition (10) is obviously a strong constraint which in general
does not hold. Therefore, Theorem 1 suggests that an exact solution of optimal consensus
is seldom possible for the given model.

Remark 4 It follows from the proof below that the necessity statement of Theorem 1 relies only
on the fact that the limit set of an autonomous system is invariant. It is straightforward to
verify that for a discrete-time autonomous dynamical system defined by

$$y_{k+1} = f(y_k) \qquad (11)$$

with $f$ a continuous function, its limit set is invariant. Therefore, if we consider a model with
discrete-time update as

$$x_i(k+1) = x_i(k) + u_i(k) \qquad (12)$$

with

$$u_i(k) = \mathcal{J}(n_i(k), g_i(k)), \qquad (13)$$

where $n_i$, $g_i$, and $\mathcal{J}$ agree with the definitions above, the necessity statement of Theorem 1 still
holds. However, the sufficiency statement of Theorem 1 may in general not hold for discrete-time
updates since even for the centralized optimization problem, there is not always an algorithm with
constant step size which can solve the problem exactly, cf., f^. 

Remark 5 In [32], a discrete-time algorithm was provided for solving (2), where the structure
of the nodes' update is the sum of a consensus term averaging the neighbors' states, and a
subgradient term of the local objective function with a fixed step size. It is easy to see that the
algorithm in [32] can be rewritten as (12) and (13) as long as the graph is fixed and the step size
is constant. All the properties we impose on the information flow and update dynamics are kept.
Convergence bounds were established for the case with constant step size in [32]. Theorem 1
shows that proposing a convergence bound is in general the best we can do for algorithms like the
one developed in [32], and the result also explains why a time-varying step size may be necessary
in distributed optimization algorithms, as in [33].
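
The effect described in Remarks 4 and 5 can be seen on a toy example. The sketch below is a hedged illustration, not the algorithm of [32] itself: it assumes two nodes, quadratic objectives $f_1(z) = z^2/2$ and $f_2(z) = (z - 2)^2/2$ (whose minimizers do not coincide), a consensus weight of $1/2$, and a discrete-time update of the form $x_i(k+1) = x_i(k) + n_i(k) - \alpha g_i(k)$ with constant step size $\alpha$. The function name `run_constant_step` is hypothetical.

```python
def run_constant_step(alpha, steps=2000):
    """Iterate x_i(k+1) = x_i(k) + n_i(k) - alpha * g_i(k) for two nodes.

    n_i = 0.5 * (x_j - x_i) is the consensus term and g_i is the gradient of f_i at x_i,
    with the assumed objectives f_1(z) = z^2 / 2 and f_2(z) = (z - 2)^2 / 2.
    """
    x1, x2 = 5.0, -3.0
    for _ in range(steps):
        n1, n2 = 0.5 * (x2 - x1), 0.5 * (x1 - x2)
        g1, g2 = x1 - 0.0, x2 - 2.0
        x1, x2 = x1 + n1 - alpha * g1, x2 + n2 - alpha * g2
    return x1, x2


for alpha in (0.2, 0.1, 0.05):
    x1, x2 = run_constant_step(alpha)
    # For this toy problem the fixed point satisfies x2 - x1 = 2 * alpha / (1 + alpha).
    print(alpha, x2 - x1, 2 * alpha / (1 + alpha))
```

For every constant $\alpha > 0$ the iteration settles at a point where the two states disagree, so only a convergence bound is available; the disagreement shrinks with $\alpha$ but never vanishes.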

In the rest of this subsection, we first give the proof of the necessity claim of Theorem 1 and
then we present a simple proof for the sufficiency part with bidirectional graphs. The sufficiency 
part of Theorem [1] in fact follows from the upcoming conclusion, Theorem [U which does not 
rely on Assumption A4. 

4.1.1 Necessity 

We now prove the necessity statement in Theorem 1 by a contradiction argument. Suppose
$\bigcap_{i=1}^N \arg\min f_i(z) = \emptyset$ and there exists a distributed control in the form of (7), say $\mathcal{J}_0(n_i, g_i)$,
under which global optimal consensus is reached for certain neighboring information flow $n_i$
satisfying Assumption A2. Let $x(t)$ be a trajectory of system (6) with control $\mathcal{J}_0(n_i, g_i)$ and
$\Lambda^+(x(t))$ be its $\omega$-limit set. The definition of optimal consensus implies that $x(t)$ converges to
the bounded set $(\arg\min F(z))^N \cap \mathcal{M}$, where $(\arg\min F(z))^N$ denotes the $N$-fold Cartesian power of
$\arg\min F(z)$ and $\mathcal{M}$ denotes the consensus manifold, defined by

$$\mathcal{M} = \{x = (x_1^T \dots x_N^T)^T : x_1 = \dots = x_N;\ x_i \in \mathbb{R}^m,\ i = 1, \dots, N\}. \qquad (14)$$

Therefore, each trajectory $x(t)$ is contained in a compact set.
Based on Lemma 2, we conclude that $\Lambda^+(x(t)) \neq \emptyset$ and

$$\Lambda^+(x(t)) \subseteq (\arg\min F(z))^N \cap \mathcal{M}. \qquad (15)$$

Moreover, $\Lambda^+(x(t))$ is positively invariant since system (6) is autonomous under control $\mathcal{J}_0(n_i, g_i)$
when the communication graph is fixed. That is, any trajectory of system (6) under control
$\mathcal{J}_0(n_i, g_i)$ must stay within $\Lambda^+(x(t))$ for any initial value in $\Lambda^+(x(t))$.

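Now we take $y \in \Lambda^+(x(t))$. Then we have $y \in (\arg\min F(z))^N \cap \mathcal{M}$ according to (15), and
thus $y = (z_*^T \dots z_*^T)^T$ for some $z_* \in \arg\min F(z)$. With Assumption A1, the convexity of the
$f_i$'s implies that

$$\arg\min F(z) = \Big\{z \in \mathbb{R}^m : \sum_{i=1}^N \nabla f_i(z) = 0\Big\}. \qquad (16)$$

On the other hand, we have

$$\bigcap_{i=1}^N \arg\min f_i(z) = \bigcap_{i=1}^N \{z \in \mathbb{R}^m : \nabla f_i(z) = 0\} = \emptyset.$$

Hence the gradients $\nabla f_i(z_*)$ sum to zero by (16) but are not all zero. Therefore, there exist two
indices $i_1, i_2 \in \{1, \dots, N\}$ with $i_1 \neq i_2$ such that

$$\nabla f_{i_1}(z_*) \neq \nabla f_{i_2}(z_*). \qquad (17)$$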

Consider the solution of (6) under control $\mathcal{J}_0(n_i, g_i)$ for initial time $t_0$ and initial value $y$.
The fact that $y$ belongs to the consensus manifold guarantees

$$n_{i_1}(t_0) = n_{i_2}(t_0) = 0. \qquad (18)$$

With Assumption A3, the injectivity of $\mathcal{J}_0(0, \cdot)$ gives

$$\mathcal{J}_0(n_{i_1}(t_0), g_{i_1}(t_0)) = \mathcal{J}_0(0, \nabla f_{i_1}(z_*)) \neq \mathcal{J}_0(0, \nabla f_{i_2}(z_*)) = \mathcal{J}_0(n_{i_2}(t_0), g_{i_2}(t_0)). \qquad (19)$$

This implies $\dot{x}_{i_1}(t_0) \neq \dot{x}_{i_2}(t_0)$. As a result, there exists a constant $\epsilon > 0$ such that $x_{i_1}(t) \neq x_{i_2}(t)$
for $t \in (t_0, t_0 + \epsilon)$. In other words, the trajectory will leave the set

$$(\arg\min F(z))^N \cap \mathcal{M}$$

for $t \in (t_0, t_0 + \epsilon)$, and therefore will also leave the set $\Lambda^+(x(t))$. This contradicts the fact that
$\Lambda^+(x(t))$ is positively invariant. The necessity part of Theorem 1 has been proved.
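
The mechanism behind the necessity argument can be traced numerically. The sketch below is an illustration under assumed data only: two scalar nodes, objectives $f_1(z) = z^2/2$ and $f_2(z) = (z - 2)^2/2$ with disjoint minimizer sets, and the particular law $\mathcal{J}(n, g) = n - g$, which is one admissible choice among many. Starting exactly at the consensus minimizer of $F$, the neighboring information vanishes, the two control inputs differ, and the states separate immediately.

```python
def grad_f(i, z):
    # Assumed toy objectives: f_1(z) = z^2 / 2 and f_2(z) = (z - 2)^2 / 2,
    # so argmin f_1 = {0} and argmin f_2 = {2} do not intersect.
    return z - (0.0 if i == 1 else 2.0)


def control(n, g):
    # One admissible law: continuous, and control(0, .) is injective.
    return n - g


z_star = 1.0                              # minimizer of F = f_1 + f_2
x = {1: z_star, 2: z_star}                # a point on the consensus manifold
n = {1: x[2] - x[1], 2: x[1] - x[2]}      # neighboring information, zero here
u = {i: control(n[i], grad_f(i, x[i])) for i in (1, 2)}
print(n, u)

# One small forward-Euler step of the integrator shows the states separating.
dt = 1e-3
x_next = {i: x[i] + dt * u[i] for i in (1, 2)}
print(x_next[1] != x_next[2])
```

The same separation occurs for any law with $\mathcal{J}(0, \cdot)$ injective, since the two nodes see different gradients at the common point; this is exactly the step that uses (17) through (19).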

4.1.2 Sufficiency: Bidirectional Case

We now provide an alternative proof of sufficiency for bidirectional graphs, which is based on 
some geometrical intuition of the vector field. Note that compared to the proof of Theorem H] on 
directed graphs, this proof uses completely different arguments which indeed cannot be applied 
to directed graphs. Therefore, we believe the proof given in the following is interesting in its
own right, because it reveals some fundamental differences between directed and bidirectional
graphs.

Let $a_{ij} > 0$ be a constant marking the weight of arc $(j, i)$. We will show that the particular
neighboring information flow

$$n_i = \sum_{j \in \mathcal{N}_i} a_{ij}(x_j - x_i)$$

and control law

$$\mathcal{J}_*(n_i, g_i) = n_i - g_i = \sum_{j \in \mathcal{N}_i} a_{ij}(x_j - x_i) - \nabla f_i(x_i) \qquad (20)$$

ensure global optimal consensus for system (6). Note that (20) is indeed a continuous-time
version of the algorithm proposed in [32].

We suppose $\mathcal{G}$ is bidirectional. In this case, we have $a_{ij} = a_{ji}$ for all $i$ and $j$, and we use
unordered pair $\{i, j\}$ to denote the edge between node $i$ and $j$.

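Noticing that

$$\mathcal{J}_*(n_i, g_i) = \sum_{j \in \mathcal{N}_i} a_{ij}(x_j - x_i) - \nabla f_i(x_i) = -\nabla_{x_i}\Big(\frac{1}{2}\sum_{j \in \mathcal{N}_i} a_{ij}|x_j - x_i|^2 + f_i(x_i)\Big), \qquad (21)$$

we have that (20) indeed solves the following convex problem

$$\begin{array}{ll} \text{minimize} & F_{\mathcal{G}}(x) = \sum_{i=1}^N f_i(x_i) + \frac{1}{2}\sum_{\{j,i\} \in \mathcal{E}} a_{ij}|x_j - x_i|^2 \\ \text{subject to} & x_i \in \mathbb{R}^m,\ i = 1, \dots, N. \end{array} \qquad (22)$$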

We establish the following lemma relating the solution sets of problems (2) and (22).

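Lemma 5 Suppose $\bigcap_{i=1}^N \arg\min f_i(z) \neq \emptyset$. Suppose also the communication graph $\mathcal{G}$ is fixed,
bidirectional, and connected. Then we have

$$\arg\min F_{\mathcal{G}}(x) = \Big(\bigcap_{i=1}^N \arg\min f_i(z)\Big)^N \cap \mathcal{M} = \big(\arg\min F(z)\big)^N \cap \mathcal{M}. \qquad (23)$$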

Proof. When $\bigcap_{i=1}^N \arg\min f_i(z) \neq \emptyset$, it is straightforward to see that

$$\arg\min F(z) = \bigcap_{i=1}^N \arg\min f_i(z).$$

Now take $x_* = (p_*^T \dots p_*^T)^T \in \big(\bigcap_{i=1}^N \arg\min f_i(z)\big)^N \cap \mathcal{M}$, where $p_* \in \bigcap_{i=1}^N \arg\min f_i(z)$.
First we have $x_* \in \arg\min_x \sum_{i=1}^N f_i(x_i)$. Second we have $x_* \in \arg\min_x \frac{1}{2}\sum_{\{j,i\} \in \mathcal{E}} a_{ij}|x_j - x_i|^2$.
Therefore, we conclude that $x_* \in \arg\min F_{\mathcal{G}}(x)$. This gives

$$\arg\min F_{\mathcal{G}}(x) \supseteq \Big(\bigcap_{i=1}^N \arg\min f_i(z)\Big)^N \cap \mathcal{M}. \qquad (24)$$

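On the other hand, convexity gives

$$\arg\min F_{\mathcal{G}}(x) = \big\{x : -(L \otimes I_m)x = \big((\nabla f_1(x_1))^T \dots (\nabla f_N(x_N))^T\big)^T\big\}, \qquad (25)$$

where $\otimes$ represents the Kronecker product, $I_m$ is the identity matrix in $\mathbb{R}^{m \times m}$, and $L = D - A$ is
the Laplacian of the graph $\mathcal{G}$ with $A = [a_{ij}]$ and $D = \mathrm{diag}(d_1, \dots, d_N)$, where $d_i = \sum_{j=1}^N a_{ij}$.
Noticing that

$$(\mathbf{1}_N^T \otimes I_m)(L \otimes I_m) = \mathbf{1}_N^T L \otimes I_m = 0,$$

where $\mathbf{1}_N = (1 \dots 1)^T \in \mathbb{R}^N$, we have

$$(\mathbf{1}_N^T \otimes I_m)\big((\nabla f_1(x_1))^T \dots (\nabla f_N(x_N))^T\big)^T = \sum_{i=1}^N \nabla f_i(x_i) = 0 \qquad (26)$$

for any $x \in \arg\min F_{\mathcal{G}}(x)$.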

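Now take $x^* = (q_1^T \dots q_N^T)^T \in \arg\min F_{\mathcal{G}}(x)$. Suppose there exist two indices $i_*$ and $j_*$ such
that

$$\nabla f_{i_*}(q_{i_*}) \neq \nabla f_{j_*}(q_{j_*}).$$

Then at least one of $\nabla f_{i_*}(q_{i_*})$ and $\nabla f_{j_*}(q_{j_*})$ must be nonzero. Taking $p \in \bigcap_{i=1}^N \arg\min f_i(z)$,
we have

$$\sum_{i=1}^N f_i(q_i) > \sum_{i=1}^N f_i(p)$$

because for $x = (x_1^T \dots x_N^T)^T \in \arg\min_x \sum_{i=1}^N f_i(x_i)$, we have $\nabla f_i(x_i) = 0$, $i = 1, \dots, N$.
Consequently, for $x_* = (p^T \dots p^T)^T$, we have

$$F_{\mathcal{G}}(x^*) > F_{\mathcal{G}}(x_*),$$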
which is impossible according to the definition of $x^*$, so that such $i_*$ and $j_*$ cannot exist. In light
of (26), this immediately implies

$$\nabla f_i(q_i) = 0, \quad i = 1, \dots, N,$$

or equivalently

$$q_i \in \arg\min f_i(z), \quad i = 1, \dots, N \qquad (27)$$

for all $x^* = (q_1^T \dots q_N^T)^T \in \arg\min F_{\mathcal{G}}(x)$.

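Therefore, we conclude from (27) that

$$\sum_{i=1}^N f_i(q_i) = \sum_{i=1}^N f_i(p),$$

and this implies

$$\sum_{\{j,i\} \in \mathcal{E}} a_{ij}|q_j - q_i|^2 = 0$$

as long as $x^* = (q_1^T \dots q_N^T)^T \in \arg\min F_{\mathcal{G}}(x)$. The connectivity of the communication graph thus
further guarantees that $q_1 = \dots = q_N$, so we have proved that $x^* \in \big(\bigcap_{i=1}^N \arg\min f_i(z)\big)^N \cap \mathcal{M}$.
Consequently, we obtain

$$\arg\min F_{\mathcal{G}}(x) \subseteq \Big(\bigcap_{i=1}^N \arg\min f_i(z)\Big)^N \cap \mathcal{M}. \qquad (28)$$

The desired lemma holds from (24) and (28). □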
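
The Laplacian identity $\mathbf{1}_N^T L = 0$ used in the proof is easy to confirm on a small graph. The following sketch takes $m = 1$ and a hypothetical symmetric weight matrix for a three-node path; the function name `laplacian` and the numerical weights are assumptions for illustration only.

```python
def laplacian(A):
    """Return L = D - A, where D is diagonal with d_i = sum_j a_ij (row sums of A)."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0.0) - A[i][j] for j in range(n)]
            for i in range(n)]


# Symmetric weights of a bidirectional path graph 1 - 2 - 3 (illustrative values).
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
L = laplacian(A)
col_sums = [sum(L[i][j] for i in range(3)) for j in range(3)]   # entries of 1^T L
row_sums = [sum(row) for row in L]                               # entries of L 1
print(L)
print(col_sums, row_sums)
```

Row sums of $L$ vanish for any weighted digraph by construction; the column sums vanish here because the weights are symmetric, which is the property of bidirectional graphs that the proof relies on.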
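Now since $F_{\mathcal{G}}(x)$ is a convex function and we have $\dot{x} = -\nabla F_{\mathcal{G}}(x)$ for system (6) with control
(20), we conclude that

$$\lim_{t \to \infty} \mathrm{dist}\big(x(t), \arg\min F_{\mathcal{G}}(x)\big) = 0.$$

Lemma 5 ensures

$$\lim_{t \to \infty} \mathrm{dist}\Big(x(t), \Big(\bigcap_{i=1}^N \arg\min f_i(z)\Big)^N \cap \mathcal{M}\Big) = 0$$

if $\mathcal{G}$ is bidirectional and connected. Equivalently, global optimal consensus is reached.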
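
The convergence claim can be illustrated with a small simulation of control law (20). Everything specific in the sketch below is an assumption chosen for illustration: scalar states, objectives $f_i(z) = \frac{1}{2}\mathrm{dist}(z, [lo_i, hi_i])^2$ whose minimizer sets are intervals with common intersection $[1.5, 2]$, a three-node bidirectional path with unit weights, and forward-Euler integration with a fixed step.

```python
def clamp(z, lo, hi):
    return min(max(z, lo), hi)


# Assumed toy objectives: f_i(z) = dist(z, [lo_i, hi_i])^2 / 2, whose gradient is
# z - P(z) with P the projection onto the interval (cf. Lemma 3(iii)).
intervals = [(0.0, 2.0), (1.0, 3.0), (1.5, 4.0)]      # intersection is [1.5, 2]
edges = {(0, 1): 1.0, (1, 2): 1.0}                    # bidirectional path, a_ij = a_ji


def velocity(x):
    """u_i = sum_j a_ij (x_j - x_i) - grad f_i(x_i), i.e. the form of control law (20)."""
    u = [-(xi - clamp(xi, lo, hi)) for xi, (lo, hi) in zip(x, intervals)]
    for (i, j), a in edges.items():
        u[i] += a * (x[j] - x[i])
        u[j] += a * (x[i] - x[j])
    return u


x = [-5.0, 0.0, 8.0]
dt = 0.01                                             # forward-Euler step (assumption)
for _ in range(20000):
    u = velocity(x)
    x = [xi + dt * ui for xi, ui in zip(x, u)]
print(x)
```

In this run the three states end up numerically equal and inside the common interval, in line with Lemma 5. Replacing one interval by a disjoint one, for example $(5, 6)$, would leave a persistent disagreement, which is the situation covered by the necessity part of Theorem 1.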

Remark 6 We see from the proof above that the construction of $F_{\mathcal{G}}(x)$ is critical because the
convergence argument is based on the fact that the gradient of $F_{\mathcal{G}}(x)$ is consistent with the
communication graph. It can be easily verified that finding such a function is in general impossible
for directed graphs.

4.2 Approximate Solution

Theorem 1 indicates that optimal consensus is impossible no matter how the control law $\mathcal{J}$
is chosen from the class described by Assumption A3, as long as the nonempty intersection condition (10) is not fulfilled. In this
subsection, we discuss the approximate solution of the optimal consensus problem in the absence
of (10). We introduce the following definition.

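Definition 2 Global $\epsilon$-optimal consensus is achieved if for all $x^0 \in \mathbb{R}^{mN}$, we have

$$\limsup_{t \to +\infty} F(x_i(t)) \le \min_{z} F(z) + \epsilon \qquad (29)$$

and

$$\lim_{t \to +\infty} |x_i(t) - x_j(t)| \le \epsilon, \quad i, j = 1, \dots, N. \qquad (30)$$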



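Denoting $F_{\mathcal{G}}(x; K) = \sum_{i=1}^N f_i(x_i) + \frac{K}{2}\sum_{\{j,i\} \in \mathcal{E}} a_{ij}|x_j - x_i|^2$, we impose the following
assumption.

A5. (i) $\arg\min F(z) \neq \emptyset$; (ii) $\arg\min F_{\mathcal{G}}(x; K) \neq \emptyset$ for all $K \ge 0$; (iii) $\bigcup_{K \ge 0} \arg\min F_{\mathcal{G}}(x; K)$
is bounded.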

For e-optimal consensus, we present the following result. 

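Theorem 2 Assume that A1 and A5 hold. Let the communication graph $\mathcal{G}$ be fixed, bidirectional,
and connected. Then for any $\epsilon > 0$, there exist a neighboring information rule $h \in \mathcal{H}$
and a control law $\mathcal{J}$ satisfying A3 such that global $\epsilon$-optimal consensus is achieved.

Proof. Again, let $a_{ij} > 0$ be any constant marking the weight of arc $(j, i)$ and $a_{ij} = a_{ji}$ for all
$(i, j) \in \mathcal{E}$. Fix $\epsilon$. We will show that under neighboring information flow
$n_i = \sum_{j \in \mathcal{N}_i} a_{ij}(x_j - x_i)$,
there exists a constant $K_\epsilon > 0$ such that the control law

$$u_i = \mathcal{J}_{K_\epsilon}(n_i, g_i) = K_\epsilon n_i - g_i \qquad (31)$$

guarantees global $\epsilon$-optimal consensus.
It is straightforward to see that

$$\mathcal{J}_K(n_i, g_i) = K\sum_{j \in \mathcal{N}_i} a_{ij}(x_j - x_i) - \nabla f_i(x_i) = -\nabla_{x_i}\Big(\frac{K}{2}\sum_{j \in \mathcal{N}_i} a_{ij}|x_j - x_i|^2 + f_i(x_i)\Big). \qquad (32)$$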
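System (6) with control law $u_i = \mathcal{J}_K(n_i, g_i)$ can be written into the following compact form

$$\dot{x} = -\nabla F_{\mathcal{G}}(x; K), \quad x = (x_1^T \dots x_N^T)^T \in \mathbb{R}^{mN}. \qquad (33)$$

Then the convexity of $F_{\mathcal{G}}(x; K)$ ensures that control law $\mathcal{J}_K(n_i, g_i)$ asymptotically solves the
convex optimization problem

$$\begin{array}{ll} \text{minimize} & F_{\mathcal{G}}(x; K) = \sum_{i=1}^N f_i(x_i) + \frac{K}{2}\sum_{\{j,i\} \in \mathcal{E}} a_{ij}|x_j - x_i|^2 \\ \text{subject to} & x_i \in \mathbb{R}^m,\ i = 1, \dots, N. \end{array} \qquad (34)$$

Convexity gives

$$\arg\min F_{\mathcal{G}}(x; K) = \big\{x : -K(L \otimes I_m)x = \big((\nabla f_1(x_1))^T \dots (\nabla f_N(x_N))^T\big)^T\big\}. \qquad (35)$$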
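Under Assumptions A1 and A5, we have that

$$L_0 = \sup\Big\{|\nabla \tilde{F}(x)| : x \in \bigcup_{K \ge 0} \arg\min F_{\mathcal{G}}(x; K)\Big\} \qquad (36)$$

is a finite number, where $\tilde{F}(x) = \sum_{i=1}^N f_i(x_i)$. We also define

$$D_0 = \sup\Big\{|z_* - x_i| : i = 1, \dots, N,\ x \in \bigcup_{K \ge 0} \arg\min F_{\mathcal{G}}(x; K)\Big\}, \qquad (37)$$

where $z_* \in \arg\min F$ is an arbitrarily chosen point.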

Let $p = (p_1^T \dots p_N^T)^T \in \arg\min F_{\mathcal{G}}(x; K)$ with $p_i \in \mathbb{R}^m$, $i = 1, \dots, N$. Since the graph is
bidirectional and connected, we can sort the eigenvalues of the Laplacian $L \otimes I_m$ as

$$0 = \lambda_1 = \dots = \lambda_m < \lambda_{m+1} \le \dots \le \lambda_{mN}.$$

Let $l_1, \dots, l_{mN}$ be the orthonormal basis of $\mathbb{R}^{mN}$ formed by the right eigenvectors of $L \otimes I_m$,
where $l_1, \dots, l_m$ are eigenvectors corresponding to the zero eigenvalue. Suppose $p = \sum_{k=1}^{mN} c_k l_k$
with $c_k \in \mathbb{R}$, $k = 1, \dots, mN$.
According to (35), we have

$$|K(L \otimes I_m)p|^2 = K^2\Big|\sum_{k=m+1}^{mN} c_k \lambda_k l_k\Big|^2 = K^2\sum_{k=m+1}^{mN} c_k^2 \lambda_k^2 \le L_0^2, \qquad (38)$$

which yields

$$\sum_{k=m+1}^{mN} c_k^2 \le \Big(\frac{L_0}{K\lambda_2^*}\Big)^2, \qquad (39)$$

where $\lambda_2^* > 0$ denotes the second smallest eigenvalue of $L$.
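Now recall that

$$\mathcal{M} = \{x = (x_1^T \dots x_N^T)^T : x_1 = \dots = x_N;\ x_i \in \mathbb{R}^m,\ i = 1, \dots, N\} \qquad (40)$$

is the consensus manifold. Noticing that $\mathcal{M} = \mathrm{span}\{l_1, \dots, l_m\}$, we conclude from (39) that

$$\Big(\frac{L_0}{K\lambda_2^*}\Big)^2 \ge \sum_{k=m+1}^{mN} c_k^2 = \Big|\sum_{k=m+1}^{mN} c_k l_k\Big|^2 = \sum_{i=1}^N |p_i - p_{\rm ave}|^2. \qquad (41)$$

The last equality in (41) is due to the fact that $\mathbf{1}_N \otimes p_{\rm ave}$, with $p_{\rm ave} = \frac{1}{N}\sum_{i=1}^N p_i$, is the projection of $p$ onto $\mathcal{M}$.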

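Thus, for any $\varsigma > 0$, there is $K_1(\varsigma) > 0$ such that when $K \ge K_1(\varsigma)$,

$$|p_i - p_{\rm ave}| \le \varsigma, \quad i = 1, \dots, N \qquad (42)$$

and

$$|F(p_i) - F(p_{\rm ave})| \le \varsigma, \quad i = 1, \dots, N, \qquad (43)$$

where $p_{\rm ave} = \frac{1}{N}\sum_{i=1}^N p_i$.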

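On the other hand, with (35), we have

$$\sum_{i=1}^N \nabla f_i(p_i) = \sum_{i=1}^N \nabla f_i(p_{\rm ave} + \hat{p}_i) = 0, \qquad (44)$$

where $\hat{p}_i = p_i - p_{\rm ave}$. Now according to (42) and (44), since each $f_i \in C^1$, for any $\varsigma > 0$, there
is $K_2(\varsigma) > 0$ such that when $K \ge K_2(\varsigma)$,

$$\Big|\sum_{i=1}^N \nabla f_i(p_{\rm ave})\Big| \le \frac{\varsigma}{D_0}. \qquad (45)$$

This implies

$$F(p_{\rm ave}) \le F(z_*) + |z_* - p_{\rm ave}| \times \Big|\sum_{i=1}^N \nabla f_i(p_{\rm ave})\Big| \le F(z_*) + \varsigma. \qquad (46)$$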



Therefore, for any $\epsilon > 0$, we can take $K_0 = \max\{K_1(\epsilon/2), K_2(\epsilon/2)\}$. Then when $K \ge K_0$,
we have

$$|p_i - p_j| \le \epsilon; \quad F(p_i) \le \min_{z} F(z) + \epsilon \qquad (47)$$

for all $i$ and $j$. Now that $F_{\mathcal{G}}(x; K)$ is a convex function and observing (33), every limit point of
system (6) with control law $\mathcal{J}_K(n_i, g_i)$ is contained in the set $\arg\min F_{\mathcal{G}}(x; K)$. Noting that $p$ is
arbitrarily chosen from $\arg\min F_{\mathcal{G}}(x; K)$, $\epsilon$-optimal consensus is achieved as long as we choose
$K_\epsilon \ge K_0$. This completes the proof. □
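
The role of the gain $K$ can be observed on a two-node example. The sketch below rests on assumptions made only for illustration: scalar states, unit edge weight, objectives $f_1(z) = z^2/2$ and $f_2(z) = (z - 2)^2/2$ (so $\min F = F(1) = 1$ and the minimizer sets do not intersect), and forward-Euler integration of $\dot{x}_i = K n_i - g_i$. The function names `equilibrium` and `F` are hypothetical.

```python
def equilibrium(K, steps=60000):
    """Integrate x_i' = K * a_ij * (x_j - x_i) - grad f_i(x_i) with forward Euler.

    Assumed toy setup: two nodes, a_12 = a_21 = 1,
    f_1(z) = z^2 / 2 and f_2(z) = (z - 2)^2 / 2.
    """
    x1, x2 = 4.0, -3.0
    dt = 0.2 / (2.0 * K + 1.0)        # step shrinks as K grows to keep Euler stable
    for _ in range(steps):
        u1 = K * (x2 - x1) - (x1 - 0.0)
        u2 = K * (x1 - x2) - (x2 - 2.0)
        x1, x2 = x1 + dt * u1, x2 + dt * u2
    return x1, x2


def F(z):
    return 0.5 * z ** 2 + 0.5 * (z - 2.0) ** 2


for K in (1.0, 10.0, 100.0):
    x1, x2 = equilibrium(K)
    # Columns: gain, disagreement |x2 - x1|, worst objective gap above min F.
    print(K, abs(x2 - x1), max(F(x1), F(x2)) - F(1.0))
```

For this example the limit can be computed by hand: the disagreement equals $2/(1 + 2K)$ and the objective gap equals $1/(1 + 2K)^2$, so both errors can be made smaller than any prescribed $\epsilon$ by increasing $K$, yet neither is zero for finite $K$.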

Remark 7 Theorem 2 can be compared to the results given in [40], where a discrete-time
incremental algorithm with constant step size was shown to be able to reach an $\epsilon$-approximate
solution of (2). Incremental algorithms rely on global iteration along each local objective
function alternately [40], [41], [43]. They are therefore fundamentally different from the model we
discuss.

Remark 8 For the discrete-time algorithm proposed in [32], a bound of the convergence error
was expressed explicitly as a function of the fixed step size. However, this bound will not vanish
as the fixed step size tends to zero or infinity [32]. Note that the parameter $K$ in the control law
$\mathcal{J}_K(n_i, g_i)$ can be viewed as a step size. As shown in Theorem 2, the convergence error vanishes
as $K$ tends to infinity, which is essentially different from the discrete-time case in [32].

From Theorems 1 and 2, we conclude that although it is impossible to reach exact optimal
consensus via a control law of the form (7) without the nonempty intersection condition (10),
it is still possible to find a control law that guarantees approximate optimal consensus with
arbitrary accuracy.

4.3 Discussion: Global vs. Local 

A fundamental question in distributed optimization is whether global optimization can be ob- 
tained by neighboring information flow and cooperative computation. We have the following 
observation. 

• Note that in this paper, determining a proper $K$ in (31) for a given $\epsilon$ relies on knowledge
of the structure of the network and on the information of all $f_i$, $i = 1, \dots, N$. Finding a
proper control law for $\epsilon$-optimal consensus thus requires global knowledge of the network.
The nonempty intersection condition in Theorem 1 is evidently also a global constraint.

• Incremental algorithms with constant step size have been shown to be able to reach e- 
optimal solution for any error bound e as long as the step size is sufficiently small, e.g., |401 
W\\ I43j . In an incremental algorithm, iteration is carried out by only one node alternatively 
on each local objective function, which is equivalent to the $N$ nodes performing
the iteration while any node can access the states of all other nodes. Therefore, the
underlying graph is in effect complete, which is certainly a global constraint.
• One can also use a time-varying step size. In [33], it was shown that global optimization can
be achieved by an algorithm combining a consensus algorithm and subgradient computation
with a time-varying step size. However, this time-varying step size must be applied to all
nodes homogeneously, which makes it a global parameter.

From the above observations, we can conclude that, in general, distributed optimization
methods cannot avoid some global information (or constraint) if global
(exact or $\epsilon$-approximate) convergence is to be guaranteed. This reveals a fundamental limit of
distributed information collection and algorithm design.
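
The role of a homogeneous time-varying step size can be seen in a small experiment. The sketch below is written in the spirit of the consensus-plus-subgradient schemes mentioned above; it is not the algorithm of [33]. It assumes scalar quadratic objectives $f_i(x) = \frac{1}{2}(x - c_i)^2$, a symmetric doubly stochastic weight matrix `W` on a path graph, and the step size $\alpha_k = 1/(k+2)$ shared by all nodes. All names and numerical values are illustrative assumptions.

```python
def consensus_gradient(targets, W, steps=5000):
    """Discrete-time averaging step followed by a local gradient step."""
    n = len(targets)
    x = [0.0] * n
    for k in range(steps):
        alpha = 1.0 / (k + 2)  # diminishing step size, identical at every node
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [mixed[i] - alpha * (x[i] - targets[i]) for i in range(n)]
    return x


third = 1.0 / 3.0
# Doubly stochastic weights on a 4-node path graph (assumed example).
W = [
    [2 * third, third, 0.0, 0.0],
    [third, third, third, 0.0],
    [0.0, third, third, third],
    [0.0, 0.0, third, 2 * third],
]
targets = [0.0, 1.0, 3.0, 8.0]          # c_i, so the sets argmin f_i are disjoint
optimum = sum(targets) / len(targets)   # minimizer of sum_i f_i

x_final = consensus_gradient(targets, W)
print("final states:", [round(v, 4) for v in x_final], "optimum:", optimum)
```

Here every node approaches the global minimizer although the local solution sets do not intersect. This works only because the step size sequence is common to all nodes, which is exactly the global parameter referred to in the last observation.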

4.4 Assumption Feasibility 

This subsection discusses the feasibility of Assumptions A4 and A5 and shows that some mild 
conditions are enough to ensure A4 and A5. 

Proposition 1 Let A1 hold. If $F(x) = \sum_{i=1}^{N} f_i(x_i)$ is coercive, i.e., $F(x) \to \infty$ as long as
$|x| \to \infty$, then A4 and A5 hold.

Proof. Assume that A1 holds.

a) Since $F(x) = \sum_{i=1}^{N} f_i(x_i)$ is coercive, it follows straightforwardly that $F(z) = \sum_{i=1}^{N} f_i(z)$
is also coercive. As a result, $\arg\min F(z) \neq \emptyset$ is a bounded set. Thus, A4 and A5.(i) hold.

b) Observing that $\sum_{\{j,i\} \in \mathcal{E}} |x_j - x_i|^2 \ge 0$ for all $x = (x_1^T \dots x_N^T)^T \in \mathbb{R}^{mN}$ and that
$F(x) = \sum_{i=1}^{N} f_i(x_i)$ is coercive, we obtain that $\arg\min F_G(x; K) \neq \emptyset$ for all $K > 0$. Thus, A5.(ii)
holds.

c) Based on a), we can denote $F_* = \min_z F(z) = F(z_*)$. Since $\sum_{i=1}^{N} f_i(x_i)$ is coercive, there
exists a constant $M(F_*) > 0$ such that $\sum_{i=1}^{N} f_i(x_i) > F_*$ for all $|x| > M$. This implies

$$F_G(x; K) > F_G(\mathbf{1}_N \otimes z_*; K) = F_* \qquad (48)$$

for all $|x| > M$. That is to say, the global minimum of $F_G(x; K)$ is reached within the set
$\{|x| \le M\}$ for all $K > 0$. Therefore, we have

$$\bigcup_{K > 0} \arg\min F_G(x; K) \subseteq \{x : |x| \le M\}. \qquad (49)$$

This proves A5.(iii). □

Next, we present another case in which A4 and A5 hold.
Proposition 2 Let A1 hold. Suppose each $\arg\min f_i$ is bounded and the argument space for
each $f_i$ is $\mathbb{R}$, i.e., $m = 1$. Then A4 and A5 hold.

Proof. Assume that A1 holds.

a) Let $x_i^* \in \arg\min f_i$. Denote $y_* = \min\{x_1^*, \dots, x_N^*\}$. Then for any $i = 1, \dots, N$, we have

$$0 \ge f_i(x_i^*) - f_i(y_*) \ge (x_i^* - y_*) \nabla f_i(y_*) \qquad (50)$$

according to inequality (i) of the lemma on convex functions. This immediately yields $\nabla f_i(y_*) \le 0$ for all $i = 1, \dots, N$.
Thus, for any $y < y_*$, we have

$$F(y) - F(y_*) \ge (y - y_*) \nabla F(y_*) = \sum_{i=1}^{N} (y - y_*) \nabla f_i(y_*) \ge 0, \qquad (51)$$

which implies $F(y) \ge F(y_*)$ for all $y < y_*$.

A symmetric analysis leads to $F(y) \ge F(y^*)$ for all $y > y^*$ with $y^* = \max\{x_1^*, \dots, x_N^*\}$.
Therefore, we obtain $F(y) \ge \min\{F(y_*), F(y^*)\}$ for all $y \notin [y_*, y^*]$. This implies that a global
minimum is reached within the interval $[y_*, y^*] = \mathrm{co}\{x_1^*, \dots, x_N^*\}$ and A5.(i) thus follows.

If $\arg\min f_i$ is bounded for $i = 1, \dots, N$, there exist $b_i \le d_i$, $i = 1, \dots, N$, such that
$\arg\min f_i = [b_i, d_i]$. Define $b_* = \min\{b_1, \dots, b_N\}$ and $d^* = \max\{d_1, \dots, d_N\}$. Following a
similar argument we have $\arg\min F \subseteq [b_*, d^*]$. Thus A4 holds.

b) Introduce the following cube in $\mathbb{R}^N$:

$$\mathcal{C}_\eta = \big\{x = (x_1 \dots x_N)^T : x_i \in [y_* - \eta, y^* + \eta],\ i = 1, \dots, N\big\},$$

where $\eta > 0$ is a given constant.

Claim. For any $K > 0$, $\mathcal{C}_\eta$ is an invariant set of system (6) under control law $J_K(n_i, g_i)$.

Define $\psi(x(t)) = \max_{i \in \mathcal{V}} x_i(t)$. Then based on Lemma 1 we have

$$D^+ \psi(x(t)) = \max_{i \in \mathcal{I}_0(t)} \frac{d}{dt} x_i(t) \le \max_{i \in \mathcal{I}_0(t)} \big[-\nabla f_i(x_i)\big], \qquad (52)$$

where $\mathcal{I}_0(t)$ denotes the index set which contains all the nodes reaching the maximum for $\psi(x(t))$.
Since

$$0 \ge f_i(x_i^*) - f_i(y^* + \eta) \ge (x_i^* - y^* - \eta) \nabla f_i(y^* + \eta), \quad i = 1, \dots, N \qquad (53)$$

we have $\nabla f_i(y^* + \eta) \ge 0$ for all $i = 1, \dots, N$. As a result, we obtain

$$D^+ \psi(x(t)) \le 0, \qquad (54)$$

which implies $\psi(x(t)) \le y^* + \eta$ for all $t \ge t_0$ under initial condition $\psi(x(t_0)) \le y^* + \eta$. Similar
analysis ensures that $\min_{i \in \mathcal{V}} x_i(t) \ge y_* - \eta$ for all $t \ge t_0$ as long as $\min_{i \in \mathcal{V}} x_i(t_0) \ge y_* - \eta$. This
proves the claim.

Note that every trajectory of system (6) under control law $J_K(n_i, g_i)$ asymptotically solves
(34). This immediately implies that $F_G(x; K)$ reaches its minimum within $\mathcal{C}_\eta$ for any $K > 0$
since $\mathcal{C}_\eta$ is an invariant set. Then A5.(ii) holds straightforwardly.

c) Since $\arg\min f_i$ is bounded for $i = 1, \dots, N$, there exist $b_i \le d_i$, $i = 1, \dots, N$, such that
$\arg\min f_i = [b_i, d_i]$. Define $b_* = \min\{b_1, \dots, b_N\}$ and $d^* = \max\{d_1, \dots, d_N\}$. We will prove the
conclusion by showing $\arg\min F_G(x; K) \subseteq \mathcal{C}_*$ for all $K > 0$, where

$$\mathcal{C}_* = \big\{x = (x_1 \dots x_N)^T : x_i \in [b_*, d^*],\ i = 1, \dots, N\big\}.$$

Let $z = (z_1 \dots z_N)^T \in \arg\min F_G(x; K)$. First we show $\max\{z_1, \dots, z_N\} \le d^*$ by a contradiction
argument. Suppose $\max\{z_1, \dots, z_N\} > d^*$.

Now let $i_1, \dots, i_k$ be the nodes reaching the maximum state, i.e., $z_{i_1} = \dots = z_{i_k} = \max\{z_1, \dots, z_N\}$.
There will be two cases.

• Let $k = N$. We have $z_1 = \dots = z_N = y$ in this case. Then for all $i$ and $x_i^* \in \arg\min f_i$,
we have

$$0 \ge f_i(x_i^*) - f_i(y) \ge (x_i^* - y) \nabla f_i(y) \qquad (55)$$

which yields $\nabla f_i(y) \ge 0$, $i = 1, \dots, N$, since $y > d^*$. This immediately leads to

$$F_G(z; K) = F(y) > \min F \ge \min_x F_G(x; K), \qquad (56)$$

which contradicts the fact that $z \in \arg\min F_G(x; K)$.

• Let $k < N$. Then we denote $s_* = \max\{z_i : i \notin \{i_1, \dots, i_k\},\ i = 1, \dots, N\}$, which is
actually the second largest value in $\{z_1, \dots, z_N\}$. We define a new point $\hat{z} = (\hat{z}_1 \dots \hat{z}_N)^T$
by $\hat{z}_i = z_i$, $i \notin \{i_1, \dots, i_k\}$, and

$$\hat{z}_i = \begin{cases} d^*, & \text{if } s_* < d^* \\ s_*, & \text{otherwise} \end{cases} \qquad (57)$$

for $i \in \{i_1, \dots, i_k\}$. Then it is easy to obtain that $F_G(z; K) > F_G(\hat{z}; K)$, which again
contradicts the choice of $z$.

Therefore, we have proved that $\max\{z_1, \dots, z_N\} \le d^*$. Based on a symmetric analysis we also
have $\min\{z_1, \dots, z_N\} \ge b_*$. Therefore, we obtain $\arg\min F_G(x; K) \subseteq \mathcal{C}_*$ for all $K > 0$ and
A5.(iii) follows. □



5 Time-varying Graphs 

Now we consider time-varying graphs. The communication in the multi-agent network is modeled
as $\mathcal{G}_{\sigma(t)} = (\mathcal{V}, \mathcal{E}_{\sigma(t)})$ with $\sigma : [0, +\infty) \to \mathcal{Q}$ being a piecewise constant function, where $\mathcal{Q}$ is a
finite set indicating all possible graphs. In this case the neighbor set for each node is time-varying,
and we let $\mathcal{N}_i(\sigma(t))$ represent the set of agent $i$'s neighbors at time $t$. As usual in the
literature [20], [28], [25], an assumption is made on how fast $\mathcal{G}_{\sigma(t)}$ can vary.

A6. (Dwell Time) There is a lower bound $\tau_D > 0$ between two consecutive switching time
instants of $\sigma(t)$.

We have the following definition. 

Definition 3 (i) $\mathcal{G}_{\sigma(t)}$ is said to be uniformly jointly strongly connected if there exists a constant
$T > 0$ such that $\mathcal{G}([t, t + T))$ is strongly connected for any $t \ge 0$.

(ii) $\mathcal{G}_{\sigma(t)}$ is said to be uniformly jointly quasi-strongly connected if there exists a constant
$T > 0$ such that $\mathcal{G}([t, t + T))$ has a spanning tree for any $t \ge 0$.
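
Joint connectivity over a window can be checked mechanically. The following sketch assumes that $\mathcal{G}([t, t+T))$ denotes the union of the arc sets of all graphs that are active at some time in $[t, t+T)$, and that the switching signal is given as a list of `(begin, finish, arcs)` triples. The helper names `reachable`, `is_strongly_connected`, and `union_graph`, and the example signal, are hypothetical.

```python
def reachable(adj, start):
    """Set of nodes reachable from `start` by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_strongly_connected(nodes, arcs):
    """A digraph is strongly connected iff one root reaches every node
    in both the graph and its reverse."""
    fwd, bwd = {}, {}
    for u, v in arcs:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)
    root = next(iter(nodes))
    return reachable(fwd, root) == set(nodes) and reachable(bwd, root) == set(nodes)


def union_graph(switching, t_start, t_end):
    """Union of arc sets of all graphs active at some time in [t_start, t_end)."""
    arcs = set()
    for begin, finish, g in switching:
        if begin < t_end and finish > t_start:
            arcs |= set(g)
    return arcs


nodes = {1, 2, 3}
G_a = [(1, 2)]                      # not strongly connected on its own
G_b = [(2, 3), (3, 1)]              # not strongly connected on its own
switching = [(0, 1, G_a), (1, 2, G_b), (2, 3, G_a), (3, 4, G_b)]  # dwell time 1

T = 2.0
window_ok = [
    is_strongly_connected(nodes, union_graph(switching, t, t + T))
    for t in (0.0, 0.5, 1.0, 1.5, 2.0)
]
print(window_ok)
```

In this example neither graph is strongly connected by itself, yet every window of length $T = 2$ inside the listed horizon contains both graphs, so their union is a directed cycle.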

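With time-varying graphs,

$$n_i(t) = h_i\big(x_i(t), x_j(t) : j \in \mathcal{N}_i(\sigma(t))\big), \qquad (58)$$

where $h_i : \mathbb{R}^m \times \mathbb{R}^{m|\mathcal{N}_i(\sigma(t))|} \to \mathbb{R}^m$ is now piecewise defined. As a result, assumption A2 is
transformed to the following piecewise version.

A7. $h \doteq h_1 \otimes \dots \otimes h_N$: $h_i$ maps $\mathbb{R}^{m(1+|\mathcal{N}_i(\sigma(t))|)}$ to $\mathbb{R}^m$ on each time interval when $\sigma(t)$ is
constant, and $h_i \equiv 0$ within the time-varying local consensus manifold $\{x_i = x_j : j \in \mathcal{N}_i(\sigma(t))\}$
for all $i \in \mathcal{V}$.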

For optimal consensus with time-varying graphs, we present the following result.

Theorem 3 Suppose A1 and A6 hold and $\mathcal{G}_{\sigma(t)}$ is uniformly jointly strongly connected. Suppose
$\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$ contains at least one interior point. Then there exist a neighboring
information rule h € and a control law J such that global optimal consensus is achieved 
and 

$$\lim_{t \to \infty} x_i(t) = x^* \qquad (59)$$

for some $x^* \in \bigcap_{i=1}^{N} \arg\min f_i$.

Note that (59) is indeed a stronger conclusion than our definition of optimal consensus, as
Theorem 3 guarantees that all the node states converge to a common point in the global solution
set of $F(z)$. We will see from the proof of Theorem 3 that this state convergence relies heavily
on the existence of an interior point of $\bigcap_{i=1}^{N} \arg\min f_i$. In the absence of such an interior point
condition, it turns out that optimal consensus still holds. We present another theorem stating
this fact.

Theorem 4 Suppose A1 and A6 hold and $\mathcal{G}_{\sigma(t)}$ is uniformly jointly strongly connected. Suppose
also Hilzi argmin/j 7^ 0. Then there exist a neighboring information rule h € ^* and a control 
law J' £ ^ such that global optimal consensus is achieved. 

The proofs of Theorems 3 and 4 rely on the following neighboring information flow

$$n_i = \sum_{j \in \mathcal{N}_i(\sigma(t))} a_{ij}(t)(x_j - x_i), \qquad (60)$$

where $a_{ij}(t) > 0$ is any weight function associated with arc $(j, i)$. The resulting control law is

$$J_*(n_i, g_i) = n_i - g_i. \qquad (61)$$

An assumption is made on each $a_{ij}(t)$, $i, j = 1, 2, \dots, N$.

A8. (Weights Rule) (i) Each $a_{ij}(t)$ is piecewise continuous and $a_{ij}(t) \ge 0$ for all $i$ and $j$.
(ii) There are $a_* > 0$ and $a^* > 0$ such that $a_* \le a_{ij}(t) \le a^*$, $t \in \mathbb{R}^+$.

5.1 Preliminary Lemmas 

We establish three useful lemmas in this subsection. 

Suppose $\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$ and take $z_* \in \bigcap_{i=1}^{N} \arg\min f_i$. We define

$$V_i(t) = |x_i(t) - z_*|^2, \quad i = 1, \dots, N, \qquad (62)$$

and

$$V(t) = \max_{i = 1, \dots, N} V_i(t). \qquad (63)$$

The following lemma holds with the proof in Appendix A.1.

Lemma 6 Let A1 and A8 hold. Suppose $\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$. Then along any trajectory of
system (6) with neighboring information (60) and control law $J_*(n_i, g_i)$, we have $D^+ V(t) \le 0$
for all $t \in \mathbb{R}^+$.

A direct consequence of Lemma 6 is that when $\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$, we have

$$\lim_{t \to \infty} V(t) = d_*^2 \qquad (64)$$

for some $d_* \ge 0$ along any trajectory of system (6) with control law $J_*(n_i, g_i)$. However, it is
still unclear whether $V_i(t)$ converges or not. We establish another lemma indicating that with a
proper connectivity condition for the communication graph, all $V_i(t)$'s have the same limit $d_*^2$.
The proof can be found in Appendix A.2.

Lemma 7 Let A1, A6, and A8 hold. Suppose $\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$ and $\mathcal{G}_{\sigma(t)}$ is uniformly jointly
strongly connected. Then along any trajectory of system (6) with neighboring information (60)
and control law $J_*(n_i, g_i)$, we have $\lim_{t \to \infty} V_i(t) = d_*^2$ for all $i$.

The next lemma shows that each node will reach its own optimum along the trajectories of
system (6) under control law $J_*(n_i, g_i)$. The proof is in Appendix A.3.

Lemma 8 Let A1, A6, and A8 hold. Suppose $\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$ and $\mathcal{G}_{\sigma(t)}$ is uniformly jointly
strongly connected. Then along any trajectory of system (6) with control law $J_*(n_i, g_i)$, we have
$\limsup_{t \to \infty} |x_i(t)|_{\arg\min f_i} = 0$ for all $i$.
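
The behaviour described by Lemmas 6 to 8 can be observed in a small simulation. The sketch below is an illustration under stated assumptions, not part of the proofs: scalar states, objectives $f_i(x) = \frac{1}{2}\,\mathrm{dist}(x, [l_i, u_i])^2$ so that $\arg\min f_i = [l_i, u_i]$, unit weights $a_{ij}(t) = 1$ on active arcs, two graphs that alternate with dwell time 1 and whose union is a directed cycle, and a forward-Euler discretization of the control law (61). The intervals, initial states, and function names are hypothetical.

```python
def project(x, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def simulate_switching(intervals, x0, graphs, z_star, dwell=1.0, dt=0.01, t_end=400.0):
    """Euler run of x_i' = sum_j a_ij (x_j - x_i) - grad f_i(x_i) with a_ij = 1."""
    x = list(x0)
    n = len(x)
    steps = int(round(t_end / dt))
    steps_per_dwell = int(round(dwell / dt))
    V_history = []
    for k in range(steps):
        arcs = graphs[(k // steps_per_dwell) % len(graphs)]  # active graph
        V_history.append(max((xi - z_star) ** 2 for xi in x))  # V(t) of (63)
        n_i = [0.0] * n
        for j, i in arcs:  # arc (j, i): node i receives the state of node j
            n_i[i] += x[j] - x[i]
        g_i = [x[i] - project(x[i], *intervals[i]) for i in range(n)]  # grad f_i
        x = [x[i] + dt * (n_i[i] - g_i[i]) for i in range(n)]
    return x, V_history


intervals = [(0.0, 3.0), (2.0, 5.0), (1.0, 4.0)]   # argmin f_i; intersection is [2, 3]
graphs = [[(1, 0)], [(2, 1), (0, 2)]]               # union: cycle 1 -> 0 -> 2 -> 1
x0 = [-4.0, 9.0, 0.5]
z_star = 2.5                                         # a point in the intersection

x_final, V_history = simulate_switching(intervals, x0, graphs, z_star)
print("final states:", [round(v, 4) for v in x_final])
```

In this run the recorded values of $V$ do not increase, and the final states agree up to numerical tolerance at a point of $[2, 3]$, the intersection of the local solution sets.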

5.2 Proof of Theorem 3

The proof of Theorem 3 relies on the following lemma.

Lemma 9 Let $z_1, \dots, z_{m+1} \in \mathbb{R}^m$ and $d_1, \dots, d_{m+1} \in \mathbb{R}^+$. Suppose there exist solutions to
equations (with variable $y$)

$$\begin{cases} |y - z_1|^2 = d_1; \\ \quad \vdots \\ |y - z_{m+1}|^2 = d_{m+1}. \end{cases} \qquad (65)$$

Then the solution is unique if $\mathrm{rank}(z_2 - z_1, \dots, z_{m+1} - z_1) = m$.

Proof. Take $j > 1$ and let $y$ be a solution to the equations. Noticing that

$$\langle y - z_1, y - z_1 \rangle = d_1; \qquad \langle y - z_j, y - z_j \rangle = d_j$$

we obtain

$$\langle y, z_j - z_1 \rangle = \frac{1}{2}\big(d_1 - d_j + |z_j|^2 - |z_1|^2\big), \quad j = 2, \dots, m + 1. \qquad (66)$$

The desired conclusion follows immediately. □
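
The argument behind Lemma 9 is that (66) is a linear system in $y$ whose coefficient rows are $z_j - z_1$, so the rank condition makes its solution unique. A minimal sketch for the planar case $m = 2$ follows. The function name `solve_from_distances`, the anchor points, and the use of Cramer's rule are illustrative choices, not taken from the paper.

```python
def solve_from_distances(z, d):
    """Recover y in R^2 from squared distances d[k] = |y - z[k]|^2, k = 0, 1, 2,
    by solving the linear equations (66)."""
    x1, y1 = z[0]
    rows, rhs = [], []
    for (xj, yj), dj in zip(z[1:], d[1:]):
        rows.append((xj - x1, yj - y1))  # z_j - z_1
        rhs.append(0.5 * (d[0] - dj + (xj ** 2 + yj ** 2) - (x1 ** 2 + y1 ** 2)))
    (a, b), (c, e) = rows
    det = a * e - b * c
    if abs(det) < 1e-12:
        # rank(z_2 - z_1, z_3 - z_1) < 2: uniqueness is not guaranteed
        raise ValueError("anchor points are collinear")
    # Cramer's rule for the 2 x 2 system
    return ((rhs[0] * e - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det)


anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]   # z_1, z_2, z_3 (assumed example)
y_true = (1.0, 2.0)
sq_dist = [(y_true[0] - px) ** 2 + (y_true[1] - py) ** 2 for px, py in anchors]
y_rec = solve_from_distances(anchors, sq_dist)
print(y_rec)
```

With collinear anchors the determinant vanishes and the sketch raises an error, which corresponds to the case where the rank condition of the lemma fails.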

We now prove Theorem 3. Let $r^* = (r_1^T \dots r_N^T)^T$ be a limit point of a trajectory of system
(6) with control law $J_*(n_i, g_i)$.

We first show consensus. Based on Lemma 7, we have $\lim_{t \to \infty} V_i(t) = d_*^2$ for all $z_* \in
\bigcap_{i=1}^{N} \arg\min f_i$. This is to say, $|r_i - z_*| = d_*$ for all $i$ and $z_* \in \bigcap_{i=1}^{N} \arg\min f_i$. Since
$\bigcap_{i=1}^{N} \arg\min f_i \neq \emptyset$ contains at least one interior point, we can find
$z_1, \dots, z_{m+1} \in \bigcap_{i=1}^{N} \arg\min f_i$ with $\mathrm{rank}(z_2 - z_1, \dots, z_{m+1} - z_1) = m$ and $d_1, \dots, d_{m+1} \in \mathbb{R}^+$,
such that each $r_i$, $i = 1, \dots, N$, is a solution of equations (65). Then based on Lemma 9, we
conclude that $r_1 = \dots = r_N$. Next, with Lemma 8, we have $|r_i|_{\arg\min f_i} = 0$. This implies that
$r_1 = \dots = r_N \in \bigcap_{i=1}^{N} \arg\min f_i$, i.e., optimal consensus is achieved.

We turn to state convergence. We only need to show that $r^*$ is unique along any trajectory
of system (6) with neighboring information (60) and control law $J_*(n_i, g_i)$. Now suppose
$\mathbf{1}_N \otimes r^1$ and $\mathbf{1}_N \otimes r^2$ are two different limit points with $r^1 \neq r^2 \in \bigcap_{i=1}^{N} \arg\min f_i$.

According to the definition of a limit point, we have that for any $\epsilon > 0$, there exists a time
instant $t_\epsilon$ such that $|x_i(t_\epsilon) - r^1| \le \epsilon$ for all $i$. Note that Lemma 6 indicates that the disc
$B(r^1, \epsilon) = \{y : |y - r^1| \le \epsilon\}$ is an invariant set for initial time $t_\epsilon$. Taking $\epsilon = |r^1 - r^2|/4$,
we see that $r^2 \notin B(r^1, |r^1 - r^2|/4)$. Thus, $\mathbf{1}_N \otimes r^2$ cannot be a limit point.

Now since the limit point is unique, we denote it as $\mathbf{1}_N \otimes x^*$ with $x^* \in \bigcap_{i=1}^{N} \arg\min f_i$. Then
we have $\lim_{t \to \infty} x_i(t) = x^*$ for all $i = 1, \dots, N$. This completes the proof.

5.3 Proof of Theorem 4

In this subsection, we prove Theorem 4. We need the following lemma on robust consensus,
which can be found in [27].

Lemma 10 Consider a network with node set $\mathcal{V} = \{1, \dots, N\}$ with time-varying communication
graph $\mathcal{G}_{\sigma(t)}$. Let the dynamics of node $i$ be

$$\dot{x}_i = \sum_{j \in \mathcal{N}_i(\sigma(t))} a_{ij}(t)(x_j - x_i) + w_i(t), \qquad (67)$$

where $w_i(t)$ is a piecewise continuous function. Suppose A6 and A8 hold and $\mathcal{G}_{\sigma(t)}$ is uniformly
jointly quasi-strongly connected. Then we have

$$\lim_{t \to \infty} |x_i(t) - x_j(t)| = 0, \quad i, j = 1, \dots, N \qquad (68)$$

if $\lim_{t \to \infty} w_i(t) = 0$ for all $i$.

Lemma 8 indicates that $\limsup_{t \to \infty} |x_i(t)|_{\arg\min f_i} = 0$, which yields

$$\lim_{t \to \infty} \nabla f_i(x_i(t)) = 0 \qquad (69)$$

for all $i$ according to Assumption A1. Then the consensus part in the definition of optimal
consensus follows immediately from Lemma 10. Again by Lemma 8, we further conclude that
$\limsup_{t \to \infty} \mathrm{dist}\big(x_i(t), \bigcap_{i=1}^{N} \arg\min f_i\big) = 0$. The desired conclusion thus follows.

6 Conclusions 

Various algorithms have been proposed in the literature for the distributed minimization of
$\sum_{i=1}^{N} f_i$ with $f_i$ only known to node $i$. This paper explored some fundamental properties of
distributed methods given a certain level of node knowledge, computational capacity, and
information flow. It was proven that there exists a control law that ensures global optimal consensus
if and only if $\arg\min f_i$, $i = 1, \dots, N$, admit a nonempty intersection set for fixed strongly
connected graphs. We also showed that for any error bound, we can find a control law which
guarantees global optimal consensus within this bound for fixed, bidirectional, and connected
graphs under some mild conditions such as that $f_i$ is coercive for some $i$. For time-varying
graphs, it was proven that optimal consensus can always be achieved as long as the graph is
uniformly jointly strongly connected and the nonempty intersection condition holds. It was then
concluded that nonempty intersection for the local optimal solution sets is a critical condition
for distributed optimization using consensus processing.

More challenges lie in exploring the corresponding limit of performance for high-order schemes, 
the optimal structure of the underlying communication graph for distributed optimization, and 
the fundamental communication complexity required for global convergence. 
Appendix

A.1 Proof of Lemma 6

Based on Lemma 1, we have

$$D^+ V(t) = \max_{i \in \mathcal{I}(t)} \frac{d}{dt} V_i(t) = \max_{i \in \mathcal{I}(t)} 2\Big\langle x_i(t) - z_*, \sum_{j \in \mathcal{N}_i(\sigma(t))} a_{ij}(t)(x_j - x_i) - \nabla f_i(x_i) \Big\rangle, \qquad (70)$$

where $\mathcal{I}(t)$ denotes the index set which contains all the nodes reaching the maximum for $V(t)$.
Let $m \in \mathcal{I}(t)$. Denote

$$Z_t = \{z : |z - z_*| \le \sqrt{V(t)}\}$$

as the disk centered at $z_*$ with radius $\sqrt{V(t)}$. Take $y = x_m(t) + (x_m(t) - z_*)$. Then from some
simple Euclidean geometry it is easy to see that $P_{Z_t}(y) = x_m(t)$, where $P_{Z_t}$ is the projector
onto $Z_t$. Thus, for all $j \in \mathcal{N}_m(\sigma(t))$, we obtain

$$\langle x_m(t) - z_*, x_j(t) - x_m(t) \rangle = \langle y - x_m(t), x_j(t) - x_m(t) \rangle = \langle y - P_{Z_t}(y), x_j(t) - P_{Z_t}(y) \rangle \le 0 \qquad (71)$$

according to inequality (i) in Lemma 3 since $x_j(t) \in Z_t$. On the other hand, based on inequality
(i) in the lemma on convex functions, we also have

$$\langle x_m(t) - z_*, -\nabla f_m(x_m(t)) \rangle \le f_m(z_*) - f_m(x_m(t)) \le 0 \qquad (72)$$

in light of the definition of $z_*$.

With (70), (71) and (72), we conclude that

$$D^+ V(t) = \max_{i \in \mathcal{I}(t)} 2\Big\langle x_i(t) - z_*, \sum_{j \in \mathcal{N}_i(\sigma(t))} a_{ij}(t)(x_j - x_i) - \nabla f_i(x_i) \Big\rangle \le 0, \qquad (73)$$

which completes the proof. □
A.2 Proof of Lemma 7

In order to prove the desired conclusion, we just need to show $\liminf_{t \to \infty} V_i(t) = d_*^2$ for all $i$.
With Lemma 6, we conclude that $\forall \epsilon > 0$, $\exists M(\epsilon) > 0$, s.t.,

$$\sqrt{V_i(t)} \le d_* + \epsilon \qquad (74)$$

for all $i$ and $t \ge M$.

Claim. For all $t \ge M$ and all $i, j \in \mathcal{V}$, we have

$$\langle x_i(t) - z_*, x_j(t) - x_i(t) \rangle \le -V_i(t) + (d_* + \epsilon)\sqrt{V_i(t)}. \qquad (75)$$

If $x_i(t) = z_*$, (75) follows trivially from (71). Otherwise we take $y_* = z_* + (d_* + \epsilon)\frac{x_i(t) - z_*}{|x_i(t) - z_*|}$
and $B_t = \{z : |z - z_*| \le d_* + \epsilon\}$. Here $B_t$ is the disk centered at $z_*$ with radius $d_* + \epsilon$, and
$y_*$ is a point on the boundary of $B_t$ that lies on the same line as $z_*$ and $x_i(t)$. Take also
$q_* = y_* + x_i(t) - z_*$. Then we have

$$\langle x_i(t) - z_*, x_j(t) - y_* \rangle = \langle q_* - y_*, x_j(t) - y_* \rangle = \langle q_* - P_{B_t}(q_*), x_j(t) - P_{B_t}(q_*) \rangle \le 0 \qquad (76)$$

according to inequality (i) in Lemma 3, which leads to

$$\begin{aligned} \langle x_i(t) - z_*, x_j(t) - x_i(t) \rangle &= \langle x_i(t) - z_*, x_j(t) - y_* \rangle + \langle x_i(t) - z_*, y_* - x_i(t) \rangle \\ &\le \langle x_i(t) - z_*, y_* - x_i(t) \rangle \\ &= -V_i(t) + (d_* + \epsilon)\sqrt{V_i(t)}. \end{aligned} \qquad (77)$$

This proves the claim.

Now suppose there exists a node $i_0$ with $\liminf_{t \to \infty} V_{i_0}(t) = \theta_{i_0}^2 < d_*^2$. Then we can find a time
sequence $\{t_k\}_1^\infty$ with $\lim_{k \to \infty} t_k = \infty$ such that

V'^<^^. (78) 

We divide the rest of the proof into three steps. 
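Step 1. Take $t_{k_0} > M$. We bound $V_{i_0}(t)$ in this step.

With the weights rule A8, (75) and inequality (i) in the lemma on convex functions, we see that

$$\begin{aligned} \frac{d}{dt} V_{i_0}(t) &= 2\Big\langle x_{i_0}(t) - z_*, \sum_{j \in \mathcal{N}_{i_0}(\sigma(t))} a_{i_0 j}(t)(x_j - x_{i_0}) - \nabla f_{i_0}(x_{i_0}(t)) \Big\rangle \\ &\le 2 \sum_{j \in \mathcal{N}_{i_0}(\sigma(t))} a_{i_0 j}(t) \big\langle x_{i_0}(t) - z_*, x_j(t) - x_{i_0}(t) \big\rangle + f_{i_0}(z_*) - f_{i_0}(x_{i_0}(t)) \\ &\le 2(N - 1)a^* \Big( -V_{i_0}(t) + (d_* + \epsilon)\sqrt{V_{i_0}(t)} \Big) \end{aligned} \qquad (79)$$

for all $t \ge t_{k_0}$, which implies

$$\frac{d}{dt} \sqrt{V_{i_0}(t)} \le -(N - 1)a^* \Big( \sqrt{V_{i_0}(t)} - (d_* + \epsilon) \Big), \quad t \ge t_{k_0}. \qquad (80)$$

In light of Gronwall's inequality, (78) and (80) yield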

VViJt) < e-^^-')'^'^- ^1^) + (l - e-(^-i)^'^*^-) {d. + e) 

< ^ 0,, + (l ) (d, + e) 

= A.. (81) 

for all $t \in [t_{k_0}, t_{k_0} + (N - 1)T_0]$ with $T_0 = T + \tau_D$, where $T$ comes from the definition of
uniformly jointly strongly connected graphs and $\tau_D$ represents the dwell time.
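Step 2. Since the graph is uniformly jointly strongly connected, we can find an instant $\hat{t} \in
[t_{k_0}, t_{k_0} + T]$ and another node $i_1 \in \mathcal{V}$ such that $(i_0, i_1) \in \mathcal{G}_{\sigma(t)}$ for $t \in [\hat{t}, \hat{t} + \tau_D]$. In this step, we
continue to bound $V_{i_1}(t)$.

Similar to (75), for all $t \ge M$ and all $i, j \in \mathcal{V}$, we also have

$$\langle x_i(t) - z_*, x_j(t) - x_i(t) \rangle \le -\sqrt{V_i(t)}\Big( \sqrt{V_i(t)} - \sqrt{V_j(t)} \Big) \qquad (82)$$

when $V_j(t) \le V_i(t)$. Then based on (75), (81), and (82), we obtain

$$\begin{aligned} \frac{d}{dt} V_{i_1}(t) &\le 2 \sum_{j \in \mathcal{N}_{i_1}(\sigma(t))} a_{i_1 j}(t) \big\langle x_{i_1}(t) - z_*, x_j(t) - x_{i_1}(t) \big\rangle \\ &= 2 \sum_{j \in \mathcal{N}_{i_1}(\sigma(t)) \setminus \{i_0\}} a_{i_1 j}(t) \big\langle x_{i_1}(t) - z_*, x_j(t) - x_{i_1}(t) \big\rangle + 2 a_{i_1 i_0}(t) \big\langle x_{i_1}(t) - z_*, x_{i_0}(t) - x_{i_1}(t) \big\rangle \\ &\le 2(N - 2)a^* \Big( -V_{i_1}(t) + (d_* + \epsilon)\sqrt{V_{i_1}(t)} \Big) - 2 a_* \sqrt{V_{i_1}(t)} \Big( \sqrt{V_{i_1}(t)} - \sqrt{V_{i_0}(t)} \Big) \\ &\le -2\big((N - 2)a^* + a_*\big) V_{i_1}(t) + 2\sqrt{V_{i_1}(t)} \big( (N - 2)a^*(d_* + \epsilon) + \Lambda_* a_* \big) \end{aligned} \qquad (83)$$

for $t \in [\hat{t}, \hat{t} + \tau_D]$, where without loss of generality we assume $V_{i_1}(t) \ge V_{i_0}(t)$ during all $t \in
[\hat{t}, \hat{t} + \tau_D]$.

Then (83) gives

$$\frac{d}{dt} \sqrt{V_{i_1}(t)} \le -\big((N - 2)a^* + a_*\big)\sqrt{V_{i_1}(t)} + \big( (N - 2)a^*(d_* + \epsilon) + \Lambda_* a_* \big), \quad t \in [\hat{t}, \hat{t} + \tau_D] \qquad (84)$$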
which yields

V,,{i + Tn) < e-((^-2)«*+-)--(4 +e) + (l- e-{(N-2)a*+a,)ro\ {N - 2)a* {d + s) + A.a. 

\ J [N - 2)a* + a* 



(l _ g-{(Af-2)a*+a.)rD^ g-(iV-l)2a*To 



{N - 2)a* + a* 2 

+ (^- ' (A^-2K+„. ' -2 )<*+^' («^> 

again by Gronwall's inequality and some simple algebra.

Next, applying the estimate of node io in step 1 on ii during time interval [t + To, tk^ + — 
1)Td], we arrive at 



VViAt)< 



{N - 2)a* + a* 



. „ _ g-{(Af-2)a*+a.)rD\ p-2(7V-l)2a*TD , 



for ah t G + Tfl, + (iV - l)rfl]. 

Step 3. Noticing that the graph is uniformly jointly strongly connected, the analysis of steps
1 and 2 can be repeatedly applied to nodes $i_2, \dots, i_{N-1}$, and eventually we have that for all
$i_0, \dots, i_{N-1}$,



V^^[tl,, + {N-l)TD) < 



-'ho 



{N - 2)a* + a, 

+ ^-( iN-2)a*^a. ) ^ 2 V'* ^ 



< (87) 

for sufficiently small e because Oi^^ < and 

/a*(l -e-((^-2)a*+a*)ro^yv-2 e-(iV-l)3a*rB 
V {N - 2)a* + a* / ^ 2 ^ 

is a constant. This immediately leads to that 

V{tk, + iN -1)Td) <d,, 

which contradicts the definition of d*. 
This completes the proof. 
A.3 Proof of Lemma 8

With Lemma 7, we have that $\lim_{t \to \infty} V_i(t) = d_*^2$ for all $i \in \mathcal{V}$. Thus, $\forall \epsilon > 0$, $\exists M(\epsilon) > 0$, s.t.,



< \/Vi{t) < + e 

for all $i$ and $t \ge M$. If $d_* = 0$, the desired conclusion follows straightforwardly. Now we suppose
$d_* > 0$.

Assume that there exists a node $i_0$ satisfying $\limsup_{t \to \infty} |x_{i_0}(t)|_{\arg\min f_{i_0}} > 0$. Then we can
find a time sequence $\{t_k\}_1^\infty$ with $\lim_{k \to \infty} t_k = \infty$ and a constant $\delta$ such that

$$|x_{i_0}(t_k)|_{\arg\min f_{i_0}} \ge \delta, \quad k = 1, \dots. \qquad (90)$$

Denote also $B_1 = \{z : |z - z_*| \le d_* + 1\}$ and $G_1 = \max\{|\nabla f_{i_0}(y)| : y \in B_1\}$. Assumption A1
ensures that $G_1$ is a finite number since $B_1$ is compact. By taking $\epsilon = 1$ in (89), we see that
$x_i(t) \in B_1$ for all $i$ and $t \ge M(1)$. As a result, we have

$$\Big| \frac{d}{dt} x_{i_0}(t) \Big| = \Big| \sum_{j \in \mathcal{N}_{i_0}(\sigma(t))} a_{i_0 j}(t)(x_j - x_{i_0}) - \nabla f_{i_0}(x_{i_0}) \Big| \le 2(N - 1)a^*(d_* + 1) + G_1. \qquad (91)$$

Combining (90) and (91), we conclude that

$$|x_{i_0}(t)|_{\arg\min f_{i_0}} \ge \frac{\delta}{2}, \quad t \in [t_k, t_k + \tau], \qquad (92)$$

for all $k = 1, \dots$, where by definition

$$\tau = \frac{\delta}{2\big(2(N - 1)a^*(d_* + 1) + G_1\big)}.$$
Now we introduce

$$D_\delta = \min\Big\{ f_{i_0}(y) - f_{i_0}(z_*) : |y|_{\arg\min f_{i_0}} \ge \frac{\delta}{2} \text{ and } y \in B_1 \Big\}.$$

Then we know $D_\delta > 0$ again by the continuity of $f_{i_0}$. According to (75), (89), and (92), we
obtain

< 2(iV- l)a*(d* +e)e-D5, (93) 



^F,„(t) < 2(A^ - l)a* ( - y,o(t) + (4 + e)yV^) + (z,) - (x,o(t)) 



for t G [tkitk + t], k = 1, . . . . This leads to 

V^,{tk + r) < yio(tfc) + (2{N - l)a*{d,+e)e - Ds)t 
<d,+e + (2{N - l)a*{d, + e)e - Ds^t 

< (94) 
as long as $\epsilon$ is sufficiently small so that

$$\epsilon\big(1 + 2(N - 1)a^*(d_* + \epsilon)\big) < D_\delta \tau.$$

We see that (94) contradicts (89). The desired conclusion thus follows.
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